Comparative effectiveness of adding delamanid to a multidrug-resistant tuberculosis regimen comprised of three drugs likely to be effective

Clarity about the role of delamanid in longer regimens for multidrug-resistant TB is needed after discordant Phase IIb and Phase III randomized controlled trial results. The Phase IIb trial found that the addition of delamanid to a background regimen hastened culture conversion; the results of the Phase III trial were equivocal. We evaluated the effect of adding delamanid for 24 weeks to three-drug MDR/RR-TB regimens on two- and six-month culture conversion in the endTB observational study. We used pooled logistic regression to estimate the observational analogue of the intention-to-treat effect (aITT) adjusting for baseline confounders and to estimate the observational analogue of the per-protocol effect (aPP) using inverse probability of censoring weighting to control for time-varying confounding. At treatment initiation, 362 patients received three likely effective drugs (delamanid-free) or three likely effective drugs plus delamanid (delamanid-containing). Over 80% of patients received two to three Group A drugs (bedaquiline, linezolid, moxifloxacin/levofloxacin) in their regimen. We found no evidence the addition of delamanid to a three-drug regimen increased two-month (aITT relative risk: 0.90 (95% CI: 0.73–1.11), aPP relative risk: 0.89 (95% CI: 0.66–1.21)) or six-month culture conversion (aITT relative risk: 0.94 (95% CI: 0.84, 1.02), aPP relative risk: 0.93 (95% CI: 0.83, 1.04)). In regimens containing combinations of three likely effective, highly active anti-TB drugs the addition of delamanid had no discernible effect on culture conversion at two or six months. As the standard of care for MDR/RR-TB treatment becomes more potent, it may become increasingly difficult to detect the benefit of adding a single agent to standard of care MDR/RR-TB regimens. Novel approaches like those implemented may help account for background regimens and establish effectiveness of new chemical entities.


Introduction
An estimated 10 million people globally developed tuberculosis (TB) disease in 2021 [1]. Of these, nearly 500,000 became sick with a strain resistant to at least rifampin (rifampin resistant TB, RR-TB), or rifampin and isoniazid (multidrug-resistant TB, MDR-TB), the most effective first-line drugs to treat TB [2]. World Health Organization (WHO) guidance on treatment for MDR/RR-TB at the time of study conduct recommended a shorter (9-11 month) seven-drug regimen and longer (18-24 month) regimens of at least four drugs likely to be effective. Many of the drugs that comprise these regimens cause debilitating side effects [3][4][5].
The introduction of delamanid (OPC-67683, Deltyba), one of the first drugs with a novel mechanism of action against M. tuberculosis in nearly 50 years, offers the potential to improve treatment for RR/MDR-TB. In 2014 the European Medicines Agency (EMA) conditionally approved delamanid for TB [6]. Results of the Phase IIb randomized controlled trial (RCT) (ClinicalTrials.gov identifier NCT00685360) showed delamanid to be efficacious: at two months, 45.4% of patients in the delamanid-plus-standard treatment arm experienced culture conversion vs 29.6% in the standard treatment arm (p = 0.008) [7]. However, the Phase III trial (ClinicalTrials.gov identifier NCT01424670) did not demonstrate a clinically relevant or statistically significant difference: 87.6% in the delamanid-plus-standard treatment arm experienced conversion by six months vs 86.1% in the placebo-plus-standard treatment arm [8].
One major difference between the trials was the composition of the background regimen. Phase IIb trial participants received four to five drugs on average, compared with a mean of 6.5 drugs in the Phase III trial. Phase III trial regimens also tended to be more potent, which could have masked delamanid's contribution to treatment outcomes. The conflicting findings and the different treatment regimens used across these trials call for further investigation of delamanid's role in MDR-TB regimens.
Based on the Phase III trial results, delamanid was categorized as a lower priority drug, to be used if a regimen with more effective drugs cannot be composed (e.g., because of resistance or intolerability) [9,10]. There have also been calls for studies of delamanid in regimens compromised by resistance or intolerability, two features that often result in patients receiving too few effective drugs [11]. However, evidence to date has not come from robust comparative effectiveness studies. In an early descriptive report of 66 patients with limited treatment options receiving delamanid under compassionate use, 80% were culture negative at six months [12]. Patients had, on average, received 3.3 likely effective drugs. This treatment response, which far exceeded that from historical cohorts treated without delamanid, suggested delamanid may be beneficial for patients on few drugs. Subsequent descriptive studies of patients treated with delamanid showed similar early success, with 70-95% of patients achieving conversion within six months of delamanid initiation [13][14][15][16][17]. However, owing to a lack of comparative studies, the role of delamanid when added to a regimen containing fewer than four drugs, the number recommended by WHO for longer individualized treatment, remains an open question.
Here, using a robust analytic design and methods grounded in causal inference, we evaluated the comparative effect on two- and six-month culture conversion of adding versus not adding delamanid for 24 weeks to MDR-TB regimens comprising only three likely effective drugs instead of the four recommended by WHO at the time of the study.

Data source and study population
We used data from the endTB observational cohort (ClinicalTrials.gov identifier NCT03259269), a prospective research cohort across 17 countries, and included participants with a positive culture and documented MDR/RR-TB at enrollment. Full details on the study protocol have been published previously [18]. In summary, participants were treated under routine programmatic conditions with a longer multidrug regimen including bedaquiline and/or delamanid, in accordance with guidelines of their respective countries and of WHO during the study period (2015-2020) [19,20]. Clinical care was further informed by the endTB clinical guide [21]. Research activities were directed by a common protocol [18]. Data were collected using standardized forms and adverse events were monitored through a unified pharmacovigilance system [22].

Design of comparative effectiveness analysis
We designed our analysis using target trial emulation [23][24][25][26][27][28] to answer the causal question of interest: what is the comparative effectiveness of adding delamanid for 24 weeks to an MDR-TB regimen of three drugs likely to be effective? We first specified a hypothetical, pragmatic "target" RCT (Appendix A in S1 Text). We then emulated this target trial with our observational data and conducted a statistical analysis to control for potential biases.

Outcome
Culture conversion is used as an interim microbiological indicator and surrogate endpoint in both observational studies and RCTs [29,30]. We assessed two- and six-month culture conversion risks. We defined culture conversion as the first of two consecutive negative cultures collected ≥15 days apart. Participants who died or were lost to follow-up (LTFU) before conversion were considered as not having converted. LTFU was defined as treatment interruption (i.e., no treatment) for ≥2 months.
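The conversion rule can be illustrated with a short sketch. This is not the study's implementation: the function name `conversion_date`, the `(collection date, is negative)` input format, and the choice to compare only adjacent cultures are assumptions made for the example, and the gap is taken to be at least 15 days and exposed as a parameter.

```python
from datetime import date
from typing import List, Optional, Tuple

# (collection date, True if the culture was negative); hypothetical format
Culture = Tuple[date, bool]


def conversion_date(cultures: List[Culture], min_gap_days: int = 15) -> Optional[date]:
    """Return the date of the first of two consecutive negative cultures
    collected at least `min_gap_days` apart, or None if no such pair exists."""
    ordered = sorted(cultures)  # chronological order
    # Walk through adjacent pairs of cultures.
    for (d1, neg1), (d2, neg2) in zip(ordered, ordered[1:]):
        if neg1 and neg2 and (d2 - d1).days >= min_gap_days:
            return d1
    return None


# Hypothetical participant: positive at baseline, then three negative cultures.
example = [
    (date(2016, 1, 1), False),
    (date(2016, 2, 1), True),
    (date(2016, 2, 10), True),   # only 9 days after the previous negative
    (date(2016, 3, 1), True),    # 20 days after the previous negative
]
converted_on = conversion_date(example)
```

In this sketch a participant with no qualifying pair returns `None`, which matches treating deaths and losses to follow-up before conversion as not having converted.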

Eligibility criteria
We included participants with a positive baseline sputum culture, defined as any culture on a sputum specimen collected ≤90 days before treatment initiation in the endTB cohort [31,32]. We excluded patients treated in the Democratic People's Republic of Korea due to substantial differences in diagnosis, treatment delivery, and lack of HIV testing, compared to the rest of the cohort. Likely effectiveness of a drug was considered established if: (1) resistance testing indicated the participant's M.tb strain was not resistant to the drug, or (2) no resistance testing had been conducted and the participant had not previously received the drug for one month or more, according to the medical record. All drugs in the WHO hierarchy were considered. Baseline regimens (i.e., those prescribed at the end of the first week of treatment) were categorized as follows, based on the number of likely effective drugs and irrespective of WHO drug group hierarchy [5]: (1) receiving a regimen of delamanid plus a background regimen of three likely effective drugs, (2) receiving a regimen of exactly three likely effective drugs, none of which was delamanid, or (3) neither 1 nor 2 (excluded from analyses). A three-drug regimen was used as the comparator because such regimens did not conform to WHO recommendations of the time, and we hypothesized that, among patients receiving three-drug regimens, culture conversion could be hastened with delamanid. In this cohort, a patient may have received a regimen of three likely effective drugs when other options were not available, because of high drug resistance, adverse events, or unavailability of drugs. Applying these criteria resulted in regimens primarily composed of bedaquiline, linezolid, levofloxacin/moxifloxacin, and clofazimine.
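The two classification steps described above (likely effectiveness of each drug, then assignment of the baseline regimen to a group) can be sketched as follows. The function names, the dictionary input format, and the decision not to apply the likely-effectiveness test to delamanid itself are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple

# drug name -> (resistant per testing: True/False, or None if untested;
#               months of prior exposure per the medical record)
Regimen = Dict[str, Tuple[Optional[bool], float]]


def likely_effective(resistant: Optional[bool], prior_months: float) -> bool:
    """Apply the two-part rule: tested and not resistant, or untested
    with less than one month of prior exposure."""
    if resistant is not None:
        return not resistant
    return prior_months < 1.0


def baseline_group(regimen: Regimen) -> str:
    """Assign a baseline regimen to a treatment group of interest."""
    background = [
        drug
        for drug, (resistant, months) in regimen.items()
        if drug != "delamanid" and likely_effective(resistant, months)
    ]
    if len(background) != 3:
        return "excluded"
    return "delamanid-containing" if "delamanid" in regimen else "delamanid-free"


# Hypothetical regimen: three likely effective drugs plus one compromised drug.
example = {
    "bedaquiline": (None, 0.0),    # untested, never received
    "linezolid": (False, 3.0),     # tested susceptible despite prior use
    "clofazimine": (None, 0.5),    # untested, under one month of prior use
    "pyrazinamide": (True, 6.0),   # tested resistant, so not counted
}
group = baseline_group(example)
```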

Statistical analysis
We estimated the observational analogue of the intention-to-treat (aITT) effect and the observational analogue of the per-protocol effect (aPP). The analysis estimating the aITT effect includes all participants, classified by their baseline treatment, and adjusted for baseline confounders. This analysis estimates the effect of initiating delamanid plus a background regimen of three drugs versus initiating a background regimen of three drugs, not including delamanid. Treatment could change during follow-up. For some participants in the delamanid-containing group, delamanid was discontinued; for some participants in the delamanid-free group, delamanid was started; and in both groups, some patients experienced changes in the number of background drugs. The aPP effect estimates the effect of adding and remaining on delamanid among participants who received a regimen of three likely effective drugs for the duration of follow-up (up to 24 weeks). Because MDR-TB treatment can vary over time, aPP analyses may be biased by time-dependent confounding [33][34][35].
Estimating the intention-to-treat analogue relative risk and risk difference of culture conversion. We fitted a pooled logistic regression model and used its predicted probabilities to estimate the risk of conversion for delamanid-containing versus delamanid-free regimens. The model was adjusted for the following baseline confounders chosen a priori using content knowledge and directed acyclic graphs: age, sex, hospitalization at treatment initiation, the number of Group A drugs (i.e., those classified as priority drugs in the 2020 WHO MDR/RR-TB guidelines including bedaquiline, linezolid, moxifloxacin/levofloxacin), whether the patient was receiving imipenem-cilastatin, body mass index <18.5, HIV infection, and hepatitis C antibody positivity. Confidence intervals were estimated using nonparametric bootstrapping with 500 samples [36]. Missing data were rare for most confounders (<1%), with the exception of baseline cavitation on chest radiography (Table 1). Primary analyses used complete cases.
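For orientation, a pooled logistic regression for a time-to-event outcome is commonly written in the following general form. This is a generic sketch, not the specification used in this analysis: the symbols (weekly conversion indicator Y_k, delamanid-containing indicator A, baseline confounder vector L, time function θ_0(k), horizon K) are assumed notation, and the exact time terms are not stated in this section.

```latex
% Discrete-time hazard of conversion in week k+1 among those not yet converted
\operatorname{logit}\,\Pr\left(Y_{k+1}=1 \mid Y_k=0,\, A,\, L\right)
  = \theta_0(k) + \theta_1 A + \theta_2^{\top} L

% Standardized risk of conversion by week K under treatment a \in \{0, 1\}
\hat{R}_a(K) = \frac{1}{n}\sum_{i=1}^{n}
  \left[\, 1 - \prod_{k=0}^{K-1}\bigl(1 - \hat{h}_k(a, L_i)\bigr) \right]

% Contrasts of interest
\widehat{RR} = \frac{\hat{R}_1(K)}{\hat{R}_0(K)}, \qquad
\widehat{RD} = \hat{R}_1(K) - \hat{R}_0(K)
```

Here \hat{h}_k(a, L_i) denotes the fitted hazard for participant i in week k with treatment set to a.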
Estimating the per-protocol analogue relative risk and risk difference of culture conversion. To estimate the aPP effect, we artificially censored participants when their treatment deviated from that administered at baseline. To adjust for selection bias due to this artificial censoring, we applied inverse probability of censoring weights [37].
To simulate the per-protocol population, we censored observations in the delamanid-containing group when delamanid had been discontinued for >2 consecutive weeks prior to the end of treatment and there was no evidence that delamanid discontinuation was in response to an adverse event. Therefore, estimated effects reflect continuation of delamanid for up to 24 weeks, unless contraindicated due to an adverse event. In the delamanid-free group, we censored observations when delamanid was added for >2 consecutive weeks. In both groups, observations were censored if drugs were added or removed from the regimen such that it contained either fewer than three or more than three drugs likely to be effective for >2 consecutive weeks. To control for selection bias due to artificial censoring, for each individual and for each week, we estimated time-varying inverse probability of censoring weights equal to the inverse of the probability of being uncensored, i.e., maintaining a regimen consistent with the baseline treatment group.
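The artificial censoring rules can be sketched as a scan over weekly regimen records. Everything below is a simplified illustration under stated assumptions: the `Week` record, the function names, counting only background drugs toward the three likely effective drugs, treating any mix of deviations as one continuous run, and reporting the week in which the deviation first exceeds two consecutive weeks are all choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Week:
    on_delamanid: bool
    n_effective_background: int            # likely effective drugs other than delamanid
    delamanid_stopped_for_ae: bool = False  # discontinuation attributed to an adverse event


def deviates(week: Week, group: str) -> bool:
    """True if this week's regimen is inconsistent with the baseline group."""
    if week.n_effective_background != 3:
        return True
    if group == "delamanid-containing":
        # Stopping delamanid in response to an adverse event is not a deviation.
        return not week.on_delamanid and not week.delamanid_stopped_for_ae
    return week.on_delamanid  # delamanid-free group: adding delamanid deviates


def censoring_week(weeks: List[Week], group: str, max_run: int = 2) -> Optional[int]:
    """Return the 0-based week at which a deviation has lasted more than
    `max_run` consecutive weeks, or None if the participant is never censored."""
    run = 0
    for index, week in enumerate(weeks):
        run = run + 1 if deviates(week, group) else 0
        if run > max_run:
            return index
    return None


# Hypothetical delamanid-free participant who starts delamanid in week 2.
ok, added = Week(False, 3), Week(True, 3)
censored_at = censoring_week([ok, ok, added, added, added, added], "delamanid-free")
```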
To estimate time-varying weights, we fitted a pooled logistic regression model to estimate the probability each participant remained on their baseline treatment (i.e., was not censored) conditional on time-varying predictors of changing treatment and time since baseline. These predictors included time-varying number of Group A drugs according to WHO classification, sputum smear result, number of adverse events, hospitalization, time, and a quadratic function of time (Primary Model, Appendix B in S1 Text). Full detail on the derivation of weights is provided in Appendix C in S1 Text.
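As a rough illustration of how weekly probabilities translate into time-varying weights, the sketch below accumulates the product of the probabilities of remaining uncensored and inverts it. The unstabilized cumulative-product form, the function name, and the example probabilities are assumptions; the derivation actually used is the one given in Appendix C in S1 Text, and in practice the probabilities would come from the fitted pooled logistic regression model.

```python
from typing import List


def censoring_weights(p_uncensored: List[float]) -> List[float]:
    """Given, for one participant, the predicted probability of remaining
    uncensored in each successive week, return the weight for each week:
    the inverse of the cumulative probability of being uncensored so far."""
    weights: List[float] = []
    cumulative = 1.0
    for p in p_uncensored:
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
        cumulative *= p
        weights.append(1.0 / cumulative)
    return weights


# Hypothetical predicted probabilities for three weeks of follow-up.
example_weights = censoring_weights([1.0, 0.8, 0.5])
```

A participant who is unlikely to have stayed on the baseline regimen but did so receives a larger weight, standing in for similar participants who were censored.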
Using a weighted logistic regression model adjusted for baseline confounders of treatment, we estimated the predicted probabilities of culture conversion for each uncensored participant. We then used the mean predicted probability of conversion by treatment group to calculate the point estimate for the relative risk and risk difference. Confidence intervals were calculated using nonparametric bootstrapping with 500 samples.
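The final step, averaging predicted probabilities by treatment group and bootstrapping the contrast, can be sketched as below. This is illustrative only: the record format, function names, seed, and percentile method for the interval are assumptions, the predicted probabilities are invented, and a full analysis would refit the weight and outcome models within each bootstrap resample rather than resampling fixed predictions.

```python
import random
from statistics import StatisticsError, mean
from typing import List, Tuple

# (treated: 1 = delamanid-containing, 0 = delamanid-free; predicted probability)
Record = Tuple[int, float]


def risk_contrast(records: List[Record]) -> Tuple[float, float]:
    """Return (relative risk, risk difference) from group mean predictions."""
    r1 = mean(p for a, p in records if a == 1)
    r0 = mean(p for a, p in records if a == 0)
    return r1 / r0, r1 - r0


def bootstrap_rr_ci(records: List[Record], n_boot: int = 500, seed: int = 2024) -> Tuple[float, float]:
    """Percentile 95% interval for the relative risk, resampling participants."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        resample = rng.choices(records, k=len(records))
        try:
            ratios.append(risk_contrast(resample)[0])
        except (StatisticsError, ZeroDivisionError):
            continue  # skip resamples with an empty group or zero control risk
    ratios.sort()
    lower = ratios[int(0.025 * len(ratios))]
    upper = ratios[min(len(ratios) - 1, int(0.975 * len(ratios)))]
    return lower, upper


# Hypothetical predicted probabilities for 40 treated and 60 untreated participants.
records = [(1, 0.7), (1, 0.8)] * 20 + [(0, 0.75), (0, 0.85)] * 30
rr, rd = risk_contrast(records)
ci_low, ci_high = bootstrap_rr_ci(records)
```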

PLOS GLOBAL PUBLIC HEALTH
Sensitivity analyses. To account for the possibility that adjustment for the number of Group A drugs did not adequately control for the efficacy of the background regimen across treatment groups, we conducted a sensitivity analysis among patients who had received bedaquiline. We restricted the delamanid-containing group to those who also received bedaquiline.

Ethical approval
We obtained approvals from the central ethics review committees for each consortium partner and local ethics committees in each country. This study was approved by the Partners Health-

Results
Between April 1, 2015 and September 30, 2018, 2757 patients were initiated on a first MDR/RR-TB regimen containing bedaquiline and/or delamanid and consented to participate in the endTB observational study (Fig 1). We excluded 1999 (72.5%) participants whose baseline regimen did not correspond to a treatment group of interest. Patients who did not have RR/MDR-TB (n = 5), had a negative or missing baseline sputum culture (n = 359), or were treated in the Democratic People's Republic of Korea (n = 32) were also excluded, leaving 362 participants (n = 123 delamanid-containing, n = 239 delamanid-free) in the aITT cohort (Fig 1).
Most patients were treated in Kazakhstan (30.1%), Georgia (16.9%), Peru (14.1%), or Pakistan (12.2%) (Table 1). Treatment groups were comparable in age, sex, and indicators of disease severity such as cavitation, bilateral disease, and smear grade; however, missing data on cavitation and bilateral disease were more common for participants in the delamanid-free group. The delamanid-containing group had a greater proportion of participants with comorbidities (Table 1).
Although participants in both groups received a background regimen of three drugs likely to be effective, there was substantial heterogeneity in companion drugs (Appendix D in S1 Text). On average, participants in the delamanid-containing group had fewer Group A drugs. In the delamanid-free group, 10.0% of participants received all three Group A drugs and 82.4% received two Group A drugs. In the delamanid-containing group, no participant received all three Group A drugs and 58.5% received two Group A drugs (Table 1).
The baseline treatment regimen was maintained for 24 continuous weeks (Fig 2) in 63.5% of participants. Of the 132 participants whose treatment regimen changed, 33 had only short-term changes (<2 weeks). In one additional participant, delamanid was removed due to an adverse event. The remaining 98 participants had regimen adjustments that resulted in a treatment group change and censoring due to addition of delamanid (n = 8); delamanid withdrawal without a documented, related adverse event (n = 5); and a background regimen change (n = 85, Fig 2). The distribution of censoring weights is shown in Appendix B in S1 Text. Adjusted aPP estimates were calculated from 348 of 362 participants with complete data (Table 2).
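The counts reported above can be reconciled with a few lines of arithmetic. The variable names are illustrative; all numbers are taken from the text.

```python
# Counts reported for the aITT cohort.
total = 362
changed = 132                  # participants whose regimen changed
short_term = 33                # changes lasting <2 weeks
removed_for_adverse_event = 1  # delamanid removed due to an adverse event

# Reasons for a treatment group change and censoring.
censored = {
    "delamanid added": 8,
    "delamanid withdrawn, no related adverse event": 5,
    "background regimen change": 85,
}

n_censored = sum(censored.values())
maintained = total - changed
pct_maintained = round(100 * maintained / total, 1)

# Changes involving delamanid itself, as a share of all censored participants.
delamanid_changes = (
    censored["delamanid added"]
    + censored["delamanid withdrawn, no related adverse event"]
)
pct_delamanid_changes = round(100 * delamanid_changes / n_censored, 1)
```

The three censoring reasons sum to the 98 participants left after removing short-term changes and the adverse-event removal from the 132 with any change.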

Discussion
In the context of regimens containing exactly three likely effective drugs, many of which comprised at least two Group A drugs and clofazimine, we found no evidence of an effect of adding delamanid on culture conversion within two and six months. This finding is consistent with the previously reported Phase III delamanid trial but different from that of the Phase IIb trial [7,8]. A major strength of our study is that we collected and used longitudinal data on treatment changes and time-varying risk factors that have historically not been represented in observational TB cohorts [38]. This, in combination with an approach rooted in target trial emulation, allowed us to answer a precise comparative effectiveness question and account for potential biases that could not be addressed in prior observational studies of MDR/RR-TB treatment.
Our treatment groups were defined by the quantity of drugs in a regimen. However, the efficacy of drugs in the regimen cannot be ignored. The advent of bedaquiline, repurposed drugs such as linezolid and clofazimine, and late-generation quinolones such as levofloxacin/moxifloxacin has transformed the TB treatment landscape. Regimens composed of drugs with lesser efficacy, like those used in the Phase IIb trial that showed a significant effect of delamanid [7], are not represented in large numbers in the endTB cohort. For example, in the Phase IIb trial, many patients received pyrazinamide, kanamycin, cycloserine, and ethambutol. In contrast, regimens in our study included bedaquiline and made even greater use of other potent drugs like linezolid and moxifloxacin/levofloxacin than did the Phase III trial [8,39], which similarly found no effect of delamanid on median time to culture conversion over six months. Thus, it is not surprising that our findings resemble those from the latter study. Whether delamanid can improve the effectiveness of regimens compromised by toxicity or resistance to Group A drugs, or prevent acquired resistance through protection of these drugs, remains unanswered. Future MDR-TB treatment research aimed at comparative effectiveness should examine both the number and efficacy of drugs in a regimen.
Delamanid-free regimens composed of only three, primarily Group A, drugs rather than the recommended four drugs performed exceedingly well, with >90% culture conversion at six months. These findings point to the possibility that three potent drugs may be sufficient in many patients. In addition, the fact that these findings corroborate those of the Phase III trial and are at odds with those of the Phase IIb trial raises red flags about the strategy used for evaluating both delamanid and bedaquiline: each was added as a single drug to a background regimen. The relative efficacy of bedaquiline compared to placebo in the Phase IIb trial was even more pronounced than that of delamanid [40]. It is critical to note the extremely poor performance of background regimen plus placebo in the bedaquiline trial: in the placebo plus background regimen arm, only 9% of participants experienced culture conversion at 8 weeks. In comparison, in the delamanid Phase IIb trial, 54% of control-arm participants experienced this outcome. The improved standard of care likely contributed to the equivocal results in the Phase III trial of delamanid and in the present observational analysis. The confirmatory Phase III trial of bedaquiline has altered the approach by considering the potential for bedaquiline to contribute to treatment shortening and to replace a toxic injectable drug, kanamycin (ClinicalTrials.gov identifier NCT02409290).
As RR/MDR-TB treatment effectiveness increases, quantifying the contribution of any single drug will become increasingly difficult, underscoring the importance of evaluating regimens, rather than individual drugs, an approach adopted by several recent pivotal trials [41,42].
This study highlights important considerations for investigators analyzing MDR-TB treatment cohorts. Analyses of baseline regimen compositions might not answer the most relevant clinical question when regimen composition commonly changes throughout the course of treatment [43].
We did not identify a clinically meaningful difference between the results of standard baseline-adjusted models, which estimate the aITT effect, and those of the inverse probability of censoring weighted analysis, which estimates the aPP effect. This tells us that, in this cohort, the effect of adding delamanid to a three-drug regimen at baseline is similar to that of adding and maintaining delamanid in a three-drug regimen for the first six months of treatment. The likely reason for not identifying differences between these analyses is that among patients who changed their regimen, changes to the primary drug of interest, delamanid, were rare (13/98, 13.3%). Only one patient discontinued delamanid due to an adverse event, reinforcing the safety of delamanid and its potential use in regimens compromised by toxicity. Changes in the number of likely effective drugs in the background regimen were more common; however, meaningful differences between estimates are expected when censoring (i.e., change in the baseline regimen) is strongly (many-fold) associated with both the treatment group (i.e., delamanid at baseline) and the outcome. This was not the case in this analysis (Appendix E in S1 Text) [43,44]. While we did not observe a meaningful difference between aITT and aPP estimates, when the objective is to estimate the effect of starting and remaining on a treatment, the approach applied here is one strategy that also addresses the potential for time-dependent confounding [37]. Target trial emulation is an intuitive framework to assist investigators through the steps of identifying the research question, the treatment groups to be compared, and an analytic approach that will produce an estimate of the desired causal effect [26].
A limitation of this analysis is that, despite narrowly defining our treatment groups and adjusting for the number of Group A drugs, there may be residual differences in the quality of the background regimen across the groups. There was an imbalance in the number of Group A drugs between the three-drug regimen without delamanid (92.4% had 2 or 3 Group A drugs) and the three-drug regimen plus delamanid (58.5% had 2 or 3 Group A drugs). Few efforts have been made to meaningfully capture the heterogeneity of individualized MDR-TB regimens in comparative effectiveness studies, likely because hundreds or even thousands of distinct regimens can be represented in any one cohort. For example, the 2012 individual patient data meta-analysis comprises over 9000 patients on 1626 different baseline regimens [45]. Further methodologic work in this area is needed, as global treatment guidance relies largely on observational cohorts for the evidence base [45][46][47]. Unmeasured confounding by non-regimen factors (patient-level factors such as demographics or disease severity) is also possible, though less likely because we collected and adjusted for a multitude of baseline and time-varying factors. Lastly, culture conversion is an imperfect predictor of final treatment outcome; therefore, we cannot draw conclusions about the association of delamanid with the proportion cured.
In conclusion, although we did not identify a benefit for 2- or 6-month culture conversion of adding delamanid to an MDR/RR-TB regimen with only three likely effective, and often highly potent, drugs, the rarity of delamanid suspension reinforces existing evidence about its safety. Important questions remain about how to optimize the use of delamanid, including whether delamanid can improve the effectiveness of regimens composed of drugs with suboptimal efficacy or improve the safety (or efficacy) of treatment through substitution for more toxic (less potent) agents. Our findings also highlight the risk of equivocal results if the drug-development approach used for bedaquiline and delamanid is applied to new chemical entities in the context of the current, improved background regimen. Finally, the analytic methods used here can facilitate articulation of precise research questions and should be considered as a strategy for reducing bias in analysis of MDR/RR-TB regimens when treatment changes are frequent.